
    Hiding the complexity: building a distributed ATLAS Tier-2 with a single resource interface using ARC middleware

    Since their inception, Grids for high energy physics have found management of data to be the most challenging aspect of operations. This problem has generally been tackled by the experiment's data management framework controlling in fine detail the distribution of data around the grid and carefully brokering jobs to sites with co-located data. This approach, however, presents experiments with a difficult and complex system to manage, as well as introducing a rigidity into the framework which is very far from the original conception of the grid.

    In this paper we describe how the ScotGrid distributed Tier-2, which has sites in Glasgow, Edinburgh and Durham, was presented to ATLAS as a single, unified resource using the ARC middleware stack. In this model the ScotGrid 'data store' is hosted at Glasgow and presented as a single ATLAS storage resource. As jobs are taken from the ATLAS PanDA framework, they are dispatched to the computing cluster with the fastest response time. An ARC compute element at each site then asynchronously stages the data from the data store into a local cache hosted at each site. The job is then launched in the batch system and accesses data locally.

    We discuss the merits of this system compared to other operational models, from the point of view of both the resource providers (sites) and the resource consumers (experiments), and consider the issues involved in transitioning to this model.
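
    A minimal Python sketch of the dispatch-and-stage pattern described above is given below. It is an illustration only, not ARC or PanDA code: the site names, paths, response-time probe and staging helper are all hypothetical, and in the real system the asynchronous staging and caching are performed by the ARC compute element itself.

        import shutil
        import threading
        import time
        from pathlib import Path

        SITES = ["glasgow", "edinburgh", "durham"]         # illustrative cluster names
        DATA_STORE = Path("/datastore")                    # stand-in for the central ScotGrid data store
        CACHES = {s: Path("/cache") / s for s in SITES}    # stand-in per-site local caches

        def response_time(site):
            """Placeholder for a real probe of a site's compute-element responsiveness."""
            start = time.time()
            # ... contact the site's ARC compute element here ...
            return time.time() - start

        def stage_input(site, lfn):
            """Copy one input file from the central store into the site cache if not already there."""
            src, dst = DATA_STORE / lfn, CACHES[site] / lfn
            dst.parent.mkdir(parents=True, exist_ok=True)
            if not dst.exists():
                shutil.copy(src, dst)

        def dispatch(job_inputs):
            """Send a job to the fastest-responding cluster and stage its inputs asynchronously."""
            site = min(SITES, key=response_time)
            stagers = [threading.Thread(target=stage_input, args=(site, lfn)) for lfn in job_inputs]
            for t in stagers:
                t.start()
            # the batch job starts once staging has finished and then reads from CACHES[site]
            return site, stagers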

    Analysing I/O bottlenecks in LHC data analysis on grid storage resources

    We describe recent I/O testing frameworks that we have developed and applied within the UK GridPP Collaboration, the ATLAS experiment and the DPM team, for a variety of distinct purposes. These include benchmarking vendor-supplied storage products, discovering scaling limits of SRM solutions, tuning of storage systems for experiment data analysis, evaluating file access protocols, and exploring I/O read patterns of experiment software and their underlying event data models. With multiple grid sites now dealing with petabytes of data, such studies are becoming essential. We describe how the tests build on, and improve, previous work and contrast how the use cases differ. We also detail the results obtained and the implications for storage hardware, middleware and experiment software.
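
    As a rough illustration of the lowest-level kind of test such frameworks contain, the Python sketch below times sequential versus scattered reads of a single file and reports the effective throughput. The file path is a placeholder and the code is written for this summary; the frameworks described in the paper drive real experiment analysis jobs and grid access protocols rather than plain POSIX reads.

        import os
        import random
        import time

        def read_throughput(path, block=1024 * 1024, sparse=False):
            """Return MB/s for reading `path` sequentially or at shuffled offsets."""
            size = os.path.getsize(path)
            offsets = list(range(0, size, block))
            if sparse:
                random.shuffle(offsets)        # emulate a scattered, analysis-like access pattern
            start = time.time()
            read = 0
            with open(path, "rb") as f:
                for off in offsets:
                    f.seek(off)
                    read += len(f.read(block))
            return read / (1024 * 1024) / (time.time() - start)

        # Hypothetical comparison of the two patterns on the same input file:
        # print(read_throughput("/data/sample.root"), read_throughput("/data/sample.root", sparse=True))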

    Tuning grid storage resources for LHC data analysis

    Grid Storage Resource Management (SRM) and local file-system solutions are facing significant challenges in supporting efficient analysis of the data now being produced at the Large Hadron Collider (LHC). We compare the performance of different storage technologies at UK grid sites, examining the effects of tuning and of recent improvements in the I/O patterns of experiment software. Results are presented for both live production systems and technologies not currently in widespread use. Performance is studied using tests, including real LHC data analysis, which can be used to aid sites in deploying or optimising their storage configuration.

    Multi-core job submission and grid resource scheduling for ATLAS AthenaMP

    AthenaMP is the multi-core implementation of the ATLAS software framework and allows the efficient sharing of memory pages between multiple threads of execution. This has now been validated for production and delivers a significant reduction in the overall application memory footprint with negligible CPU overhead. Before AthenaMP can be routinely run on the LHC Computing Grid it must be determined how the computing resources available to ATLAS can best exploit the notable improvements delivered by switching to this multi-process model. A study into the effectiveness and scalability of AthenaMP in a production environment will be presented. Best practices for configuring the main LRMS implementations currently used by grid sites will be identified in the context of multi-core scheduling optimisation.
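
    A central practical point is that the batch system must hand a single AthenaMP job several cores on one node, with a memory request well below the sum of independent single-process jobs because read-only pages are shared. The Python sketch below generates an illustrative whole-node submission script using SLURM-style directives; the queue name, memory figure and payload script are assumptions made for the example, and the paper itself surveys several LRMS implementations rather than prescribing one configuration.

        def multicore_job_script(cores=8, mem_per_core_mb=2000, queue="atlas", payload="run_athenamp.sh"):
            """Build an illustrative whole-node batch script for an AthenaMP-style job."""
            total_mem = cores * mem_per_core_mb    # upper bound; shared pages cut the real footprint
            return "\n".join([
                "#!/bin/bash",
                f"#SBATCH --partition={queue}",        # hypothetical queue name
                "#SBATCH --nodes=1",
                f"#SBATCH --cpus-per-task={cores}",    # reserve the cores on a single node
                f"#SBATCH --mem={total_mem}M",
                f"export ATHENA_PROC_NUMBER={cores}",  # number of worker processes AthenaMP forks
                f"./{payload}",
            ])

        print(multicore_job_script())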

    Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration

    Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O-bound services within the typical LHC Tier-2: in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can both exhibit I/O-limited efficiency. We estimate the effectiveness of affordable SSDs in the context of worker nodes on a large Tier-2 production setup, using both low-level tools and real, I/O-intensive LHC data analysis jobs, comparing and contrasting with high-performance spinning-disk solutions. We consider the applicability of each solution in the context of its price/performance metrics, with an eye on the pragmatic issues facing Tier-2 provision and upgrades.
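
    To give a flavour of the low-level side of such a comparison, the Python sketch below runs many concurrent small random readers against a test file, mimicking a many-core worker node serving single-threaded jobs, and reports the aggregate throughput. The file paths and parameters are placeholders, and unlike dedicated benchmarking tools it does not bypass the page cache; the study itself combines tools of this kind with full LHC analysis jobs before drawing price/performance conclusions.

        import os
        import random
        import time
        from concurrent.futures import ThreadPoolExecutor

        def one_reader(path, block=4096, requests=1000):
            """Emulate one single-threaded job doing scattered small reads; returns bytes read."""
            size = os.path.getsize(path)
            total = 0
            with open(path, "rb") as f:
                for _ in range(requests):
                    f.seek(random.randrange(0, max(size - block, 1)))
                    total += len(f.read(block))
            return total

        def aggregate_throughput(path, jobs=16):
            """Run `jobs` concurrent readers, as on a many-core worker node; return MB/s."""
            start = time.time()
            with ThreadPoolExecutor(max_workers=jobs) as pool:
                read = sum(pool.map(lambda _: one_reader(path), range(jobs)))
            return read / (1024 * 1024) / (time.time() - start)

        # Hypothetical comparison of the same test file hosted on SSD and on spinning disk:
        # print(aggregate_throughput("/ssd/testfile"), aggregate_throughput("/hdd/testfile"))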

    ScotGrid: Providing an Effective Distributed Tier-2 in the LHC Era

    ScotGrid is a distributed Tier-2 centre in the UK with sites in Durham, Edinburgh and Glasgow. ScotGrid has undergone a huge expansion in hardware in anticipation of the LHC and now provides more than 4 MSI2K and 500 TB to the LHC VOs. Scaling up to this level of provision has brought many challenges to the Tier-2, and we show in this paper how we have adopted new methods of organising the centres, from fabric management and monitoring to the remote management of sites and to management and operational procedures, to meet these challenges. We describe how we have coped with different operational models at the sites, where the Glasgow and Durham sites are managed "in house" but resources at Edinburgh are managed as a central university resource. This required the adoption of a different fabric management model at Edinburgh and a special engagement with the cluster managers. Challenges arose from the different job models of local and grid submission, which required special attention to resolve. We show how ScotGrid has successfully provided an infrastructure for ATLAS and LHCb Monte Carlo production. Special attention has been paid to ensuring that user analysis functions efficiently, which has required optimisation of local storage and networking to cope with the demands of user analysis. Finally, although these Tier-2 resources are pledged to the whole VO, we have established close links with our local physics user communities as the best way to ensure that the Tier-2 functions effectively as a part of the LHC grid computing framework.

    Herbicide-Resistant Crops: Utilities and Limitations for Herbicide-Resistant Weed Management

    Since 1996, genetically modified herbicide-resistant (HR) crops, particularly glyphosate-resistant (GR) crops, have transformed the tactics that corn, soybean, and cotton growers use to manage weeds. The use of GR crops continues to grow, but weeds are adapting to the common practice of using only glyphosate to control weeds. Growers using only a single mode of action to manage weeds need to change to a more diverse array of herbicidal, mechanical, and cultural practices to maintain the effectiveness of glyphosate. Unfortunately, the introduction of GR crops and the high initial efficacy of glyphosate often lead to a decline in the use of other herbicide options and less investment by industry to discover new herbicide active ingredients. With some exceptions, most growers can still manage their weed problems with currently available selective and HR crop-enabled herbicides. However, current crop management systems are in jeopardy given the pace at which weed populations are evolving glyphosate resistance. New HR crop technologies will expand the utility of currently available herbicides and enable new interim solutions for growers to manage HR weeds, but will not replace the long-term need to diversify weed management tactics and discover herbicides with new modes of action. This paper reviews the strengths and weaknesses of anticipated weed management options and the best management practices that growers need to implement in HR crops to maximize the long-term benefits of current technologies and reduce weed shifts to difficult-to-control and HR weeds.

    Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
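
    The measurement underlying these figures is conceptually simple: for each family, compare the database annotation of every sequence against an experimentally grounded gold standard and count the disagreements. The Python sketch below shows that calculation on toy data; the accessions, annotation strings and matching rule (the study distinguishes several categories of misannotation, notably overprediction) are simplified for illustration.

        def misannotation_rate(predicted, gold):
            """Fraction of sequences whose database annotation disagrees with the gold standard."""
            shared = [acc for acc in predicted if acc in gold]
            wrong = sum(1 for acc in shared if predicted[acc] != gold[acc])
            return wrong / len(shared) if shared else 0.0

        # Toy data for a hypothetical family: accession -> annotated molecular function.
        gold = {"P1": "adenosine deaminase", "P2": "adenosine deaminase", "P3": "chlorohydrolase"}
        predicted = {"P1": "adenosine deaminase", "P2": "chlorohydrolase", "P3": "chlorohydrolase"}
        print(f"misannotation rate: {misannotation_rate(predicted, gold):.0%}")    # -> 33%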

    A Roadmap for HEP Software and Computing R&D for the 2020s

    Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.